System-Call Based Problem Diagnosis for PVFS

Authors

  • Michael P. Kasick
  • Keith A. Bare
  • Eugene E. Marinelli
  • Jiaqi Tan
  • Rajeev Gandhi
  • Priya Narasimhan
Abstract

We present a syscall-based approach to automatically diagnose performance problems, server-to-client propagated errors, and server crash/hang problems in PVFS. Our approach compares the statistical and semantic attributes of syscalls across PVFS servers in order to diagnose the culprit server, under these problems, for different file-system benchmarks—dd, PostMark and IOzone—in a PVFS cluster.
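
The peer-comparison idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `indict_outlier`, the median-absolute-deviation test, the threshold of 3.0, and the sample latencies are all assumptions chosen for illustration. The sketch only shows the general shape of comparing one per-server syscall statistic across peers and flagging the server that deviates from the rest.

```python
from statistics import median


def indict_outlier(metric_by_server, threshold=3.0):
    """Return the servers whose metric deviates strongly from their peers.

    Hypothetical helper: uses a median-absolute-deviation (MAD) test,
    which is an assumption and not the method described in the paper.
    """
    values = list(metric_by_server.values())
    med = median(values)
    # MAD: a robust estimate of how much healthy peers differ from each other.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Most peers agree exactly, so any server off the median stands out.
        return sorted(s for s, v in metric_by_server.items() if v != med)
    return sorted(
        s for s, v in metric_by_server.items()
        if abs(v - med) / mad > threshold
    )


# Made-up mean service times (ms) of write syscalls on four servers.
samples = {"server0": 1.9, "server1": 2.1, "server2": 2.0, "server3": 9.5}
print(indict_outlier(samples))  # prints ['server3']
```

A median-based test is used here because a single faulty server would inflate a mean and standard deviation computed over the same small peer group.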

Similar resources

Behavior-Based Problem Localization for Parallel File Systems

We present a behavior-based problem-diagnosis approach for PVFS that analyzes a novel source of instrumentation—CPU instruction-pointer samples and function-call traces—to localize the faulty server and to enable root-cause analysis of the resource at fault. We validate our approach by injecting realistic storage and network problems into three different workloads (dd, IOzone, and PostMark) on ...

Diagnosing Performance Problems in Parallel File Systems

This work describes and compares two black-box approaches, using syscall statistics and OS-level performance metrics, to automatically diagnose different performance problems in parallel file systems. Both approaches rely on peer-comparison diagnosis to compare statistical attributes of relevant metrics across servers in order to indict the culprit node. An observation-based checklist is develo...

The Parallel Virtual File System for Commodity Clusters

One benefit of cluster computer architectures is the opportunity for large I/O bandwidths. High performance applications that require significant I/O throughput are increasingly of interest to the cluster-computing community. Parallel file systems are critical system software components that allow parallel applications to take advantage of the parallel I/O disk subsystems in a cluster architectu...

Black-Box Problem Diagnosis in Parallel File Systems

We focus on automatically diagnosing different performance problems in parallel file systems by identifying, gathering and analyzing OS-level, black-box performance metrics on every node in the cluster. Our peer-comparison diagnosis approach compares the statistical attributes of these metrics across I/O servers, to identify the faulty node. We develop a root-cause analysis procedure that furthe...

Design and implementation of PVFS-PM: a cluster file system on SCore

This paper discusses the design and implementation of a cluster file system, called PVFS-PM, on the SCore cluster system software. This is the first attempt to implement a cluster file system on the SCore system. It is based on the PVFS cluster file system but replaces TCP with the PMv2 communication library supported by SCore to provide a scalable, high-performance cluster file system. PVFS-PM...

Publication date: 2009